Recursive X-Y cut using bounding boxes of connected components

Authors

  • Jaekyu Ha
  • Robert M. Haralick
  • Ihsin T. Phillips
Abstract

A top-down page segmentation technique known as the recursive X-Y cut decomposes a document image recursively into a set of rectangular blocks. This paper proposes that the recursive X-Y cut be implemented using bounding boxes of connected components of black pixels instead of using image pixels. The advantage is that great improvement can be achieved in computation. In fact, once bounding boxes of connected components are obtained, the recursive X-Y cut is completed within an order of a second on SPARC-10 workstations for letter-sized document images scanned at 300 dpi resolution.

Keywords: page segmentation, recursive X-Y cut, projection profile, connected components
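The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the bounding boxes are already available as half-open `(x0, y0, x1, y1)` tuples, it replaces the projection-profile analysis with a simple search for empty gaps between projected box intervals, and the names `find_gaps`, `xy_cut`, `min_gap_x`, and `min_gap_y` are hypothetical, with arbitrary default thresholds.

```python
from bisect import bisect_left
from typing import List, Tuple

# Assumed box format: (x0, y0, x1, y1), with x1 and y1 exclusive.
Box = Tuple[int, int, int, int]


def find_gaps(boxes: List[Box], axis: int, min_gap: int) -> List[int]:
    """Return cut positions along `axis` (0 = x, 1 = y).

    A cut is placed in the middle of every stretch of at least `min_gap`
    units that no box covers when projected onto the axis.
    """
    intervals = sorted((b[axis], b[axis + 2]) for b in boxes)
    cuts = []
    reach = intervals[0][1]  # farthest end covered so far
    for start, end in intervals[1:]:
        if start - reach >= min_gap:
            cuts.append((reach + start) // 2)
        reach = max(reach, end)
    return cuts


def xy_cut(boxes: List[Box], min_gap_x: int = 20, min_gap_y: int = 10) -> List[Box]:
    """Recursively split a set of boxes; return one rectangle per leaf block."""
    if not boxes:
        return []
    # Try horizontal cuts (gaps in the y projection) first, then vertical ones.
    for axis, min_gap in ((1, min_gap_y), (0, min_gap_x)):
        cuts = find_gaps(boxes, axis, min_gap)
        if cuts:
            groups: List[List[Box]] = [[] for _ in range(len(cuts) + 1)]
            for b in boxes:
                # A box belongs to the group whose cut range contains its end.
                groups[bisect_left(cuts, b[axis + 2])].append(b)
            blocks: List[Box] = []
            for group in groups:
                blocks.extend(xy_cut(group, min_gap_x, min_gap_y))
            return blocks
    # No valid cut in either direction: this set of boxes is a leaf block.
    return [(
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )]


if __name__ == "__main__":
    # Toy page: a left column with two paragraphs and a right column.
    page = [(0, 0, 40, 10), (50, 0, 90, 10), (0, 40, 90, 50), (150, 0, 240, 50)]
    print(xy_cut(page))
    # [(0, 0, 90, 10), (0, 40, 90, 50), (150, 0, 240, 50)]
```

Because every step works on a list of box coordinates rather than on pixels, the cost depends on the number of connected components, not on the image size, which is the source of the speedup the abstract describes.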


Similar resources

Word Segmentation for Document Images by Successively Merging Adjacent Character Bounding Boxes by Iterative Dilation

A new method of word segmentation for document images is presented. The method uses the bounding box regions to enclose the letters (characters) of the words and then the resulting letter spaces are progressively filled to merge the character bounding boxes to get the word bounding boxes. The method holds good for inclined and irregularly distributed words. The proposed method completely avoids...


Document page decomposition by the bounding-box project

This paper describes a method for extracting words, textlines and text blocks by analyzing the spatial configuration of bounding boxes of connected components on a given document image. The basic idea is that connected components of black pixels can be used as computational units in document image analysis. In this paper, the problem of extracting words, textlines and text blocks is viewed as a...


Automation of Hole-Cutting for Overset Grids Using the X-rays Approach

Overset grids resolve complicated geometries by creating high quality structured grids that are independently built for each component. This simplifies the process of grid generation, but domain connectivity must be performed so that adjacent grids share information. Shortcomings of the current X-rays method for the hole-cutting step of domain connectivity were explored to automate the X-rays a...


Arbitrary-Shape Object Localization Using Adaptive Image Grids

Sliding-window based search is a widely used technique for object localization. However, for objects of non-rectangle shapes, noises in windows may mislead the localization, causing unsatisfactory results. In this paper, we propose an efficient bottom-up approach for detecting arbitrary-shape objects using image grids as basic components. First, a test image is partitioned into n × n grids and ...


Technical Report TR-2004-024: Concerning Cut Point Spaces of Order Three

A point p of a topological space X is a cut point of X if X − {p} is disconnected. Further, if X − {p} has precisely m components for some natural number m ≥ 2, we will say that p has cut point order m. If each point y of a connected space Y is a cut point of Y, we will say that Y is a cut point space. Herein we construct a space S so that S is a connected Hausdorff space and each point of S is ...




Publication date: 1995